On Finding Rank Regret Representatives

Abstract

Selecting the best items in a dataset is a common task in data exploration. However, the concept of “best” lies in the eyes of the beholder: different users may consider different attributes more important and, hence, arrive at different rankings. Nevertheless, one can remove “dominated” items and create a “representative” subset of the data, comprising the “best items” in it. A Pareto-optimal representative is guaranteed to contain the best item of each possible ranking, but it can be a large portion of the data. A much smaller representative can be found if we relax the requirement of including the best item for each user and instead just limit the users’ “regret.” Existing work defines regret as the loss in score incurred by limiting consideration to the representative instead of the full dataset, for any chosen ranking function. However, the score is often not a meaningful number, and users may not understand its absolute value. Sometimes small ranges in score can include large fractions of the dataset. In contrast, users do understand the notion of rank ordering. Therefore, we consider items’ positions in the ranked list in defining the regret and propose the rank-regret representative as the minimal subset of the data containing at least one of the top-k items of any possible ranking function. This problem is polynomial-time solvable in two-dimensional space but is NP-hard in three or more dimensions. We design a suite of algorithms to fulfill different purposes, such as whether relaxation is permitted on k, the result size, or both; whether the distribution is known; whether theoretical guarantees or practical efficiency is important; and so on. Experiments on real datasets demonstrate that we can efficiently find small subsets with small rank-regrets.
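
The definition of rank-regret in the abstract can be made concrete with a small sketch. The code below is not the paper's algorithm. It only evaluates the rank-regret of a given subset, and it assumes linear ranking functions with non-negative weights in two dimensions, sampled at evenly spaced angles. The function names (`rank_of_best`, `sampled_rank_regret`, `quadrant_directions`) and the example data are hypothetical. Because only a finite sample of ranking functions is checked, the result is a lower bound on the true rank-regret.

```python
import math

def rank_of_best(points, subset, w):
    """Rank (1 = top) of the best subset item under the linear function w."""
    scores = [sum(wi * xi for wi, xi in zip(w, p)) for p in points]
    best = max(scores[i] for i in subset)
    # Rank is 1 plus the number of items scoring strictly higher.
    return 1 + sum(1 for s in scores if s > best)

def sampled_rank_regret(points, subset, directions):
    """Worst rank of the subset's best item over the sampled functions."""
    return max(rank_of_best(points, subset, w) for w in directions)

def quadrant_directions(n):
    """n evenly spaced non-negative 2D weight vectors (assumed sampling)."""
    step = (math.pi / 2) / (n - 1)
    return [(math.cos(i * step), math.sin(i * step)) for i in range(n)]

# Hypothetical toy dataset: each tuple is an item with two attributes.
points = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7), (0.5, 0.5), (0.2, 0.3)]
directions = quadrant_directions(91)

print(sampled_rank_regret(points, [2], directions))        # single item
print(sampled_rank_regret(points, [0, 1, 2], directions))  # three items
```

On this toy data, the single item (0.7, 0.7) has a sampled rank-regret of 2, since it is never worse than second under any sampled function, while the three-item subset has a rank-regret of 1. This illustrates the trade-off the abstract describes: allowing k = 2 shrinks the representative from three items to one.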

Similar resources

RRR: Rank-Regret Representative

Selecting the best items in a dataset is a common task in data exploration. However, the concept of “best” lies in the eyes of the beholder: different users may consider different attributes more important, and hence arrive at different rankings. Nevertheless, one can remove “dominated” items and create a “representative” subset of the data set, comprising the “best items” in it. A Pareto-optim...

Finding Diverse, High-Value Representatives on a Surface of Answers

In many applications, the system needs to selectively present a small subset of answers to users. The set of all possible answers can be seen as an elevation surface over a domain, where the elevation measures the quality of each answer, and the dimensions of the domain correspond to attributes of the answers with which similarity between answers can be measured. This paper considers the proble...

BestTime: Finding Representatives in Time Series Datasets

Given a set of time series, we aim at finding representatives which best comprehend the recurring temporal patterns contained in the data. We demonstrate BestTime, a Matlab application that uses recurrence quantification analysis to find time series representatives.

Finding representatives in a large dataset of spectral reflectances

We propose a new method to construct representative spectra from a large database of spectral reflectances. The key is the optimisation of a Support Vector type functional. The representatives are constructed such that they sit at positions of high density in the set of spectra. At the same time they are constructed to be as orthogonal as possible. The representatives are expressible as a linea...

Finding Minimal Length Representatives in Thompson’s Group F

Cleary and Taback devised a method called the nested traversal method to construct minimal length representatives for positive and negative elements in Thompson’s group. We show how to use the nested traversal method to construct minimal length representatives for a larger class of elements of this

Journal

Journal title: ACM Transactions on Database Systems

Year: 2022

ISSN: 1557-4644, 0362-5915

DOI: https://doi.org/10.1145/3531054